Democratizing data science through data science training.
Abstract
The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and to open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific lectures and tutorials. ERuDIte now indexes over 9,500 resources. Given the richness of online training materials and the constant evolution of biomedical data science, computational methods applying information retrieval, natural language processing, and machine learning techniques are required: in effect, using data science to inform training in data science. In so doing, the TCC seeks to democratize novel insights and discoveries brought forth via large-scale data science training.
Similar resources
Democratizing Data Science: Effecting positive social change with data science
The effective translation of data into novel insights, discoveries, and solutions, also known as data science, has enormous potential to bring about positive social change. In this paper, we propose ways to “democratize data science”: that is, to allocate the power of data science to society’s greatest needs. Two underlying challenges are 1) the misalignment of economic incentives to aspiring d...
Democratizing nanotech, then and now.
In October 2006, in the first issue of this journal, I described the idea of ‘democratizing science’, a state of affairs in which non-experts have active and constructive roles in science policy decisions. At that time, there were expectations that nanotechnology would be a laboratory for experimenting with the idea of democratizing science. From this came an impressive battery of focus group...
Data Management in Dynamic Environment-driven Computational Science
Advances in numerical modeling, computational hardware, and problem solving environments have driven the growth of computational science over the past decades. Science gateways, based on service oriented architectures and scientific workflows, provide yet another step in democratizing access to advanced numerical and scientific tools, computational resource and massive data storage, and fosteri...
Perspectives on Surgical Data Science
The availability of large amounts of data together with advances in analytical techniques afford an opportunity to address difficult challenges in ensuring that healthcare is safe, effective, efficient, patient-centered, equitable, and timely. Surgical care and training stand to tremendously gain through surgical data science. Herein, we discuss a few perspectives on the scope and objectives fo...
Democratizing Health Data for Translational Research.
There is an expanding and intensive focus on the accessibility, reproducibility, and rigor of basic, clinical, and translational research. This focus complements the need to identify sustainable ways to generate actionable research results that improve human health. The principles and practices of open science offer a promising path to address both issues by facilitating: 1) increased transpare...
Journal: Pacific Symposium on Biocomputing
Volume: 23
Publication date: 2018